This report aims to raise awareness of the tragic reality of deaths caused by diarrhoea in children under five. Although it is a preventable and treatable condition, diarrhoea remains one of the leading causes of child mortality in many low-income countries, where access to healthcare is limited.
The narrative voice of this report is that of Najir, a 4-year-old child who tells his story: from when he tells his mother he has a stomach ache, to when, despite efforts to find a cure, he sadly does not survive. Although this is a fictional story, it is plausible enough to reflect what happens in countries with low access to diarrhoea care, where the inadequacy of healthcare resources is tied to economic and social inequalities.
Data Sources
The data used in this report comes from the following datasets:
unicef_indicator_1.csv: provides global data on the percentage of children under five with diarrhoea who can access healthcare facilities
unicef_indicator_2.csv: offers global data on the percentage of healthcare facilities with basic water access
unicef_metadata.csv: includes economic and social data
deaths_caused_by_diarrhoea.csv: provides data on diarrhoea-related child mortality categorized by seven regions. You can download it here.
Please find the Tableau version of this report here.
Najir’s Story
This is me
My name is Najir, and I am four years old.
I do not feel well. A sharp pain grips my stomach, and my mother tells me I have diarrhoea. She holds my hand tightly, her face etched with worry, her eyes filled with silent fear. “We must find help,” she says. However, seeking medical care is not a simple task.
I live in one of the nineteen countries where, according to data collected between 2017 and 2023, fewer than 40% of children under five with diarrhoea have access to medical advice or treatment at a healthcare facility or from a trained provider.
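The filter behind a count like this can be sketched as follows. This is a minimal sketch under stated assumptions, not the report's actual preprocessing: it assumes the indicator file has the `country`, `sex`, `time_period`, and `obs_value` columns used elsewhere in this report, that national figures are stored under a `sex` value of `"Total"` (an assumption), and that each country is judged on its most recent observation in the 2017 to 2023 window. The function name `countries_below_threshold` is illustrative.

```python
import pandas as pd


def countries_below_threshold(df, threshold=40.0, start=2017, end=2023, sex="Total"):
    """Return each country's latest in-window observation if it is below `threshold`."""
    # Assumption: national figures are the rows where sex == "Total".
    if "sex" in df.columns:
        df = df[df["sex"] == sex]

    # Keep only observations collected inside the survey window (inclusive).
    in_window = df[df["time_period"].between(start, end)]

    # Keep the most recent observation for each country.
    latest = (
        in_window.sort_values("time_period", kind="stable")
        .drop_duplicates(subset="country", keep="last")
    )

    # Select the countries whose access rate is under the threshold,
    # ordered from lowest to highest access.
    low = latest[latest["obs_value"] < threshold]
    return low.sort_values("obs_value").reset_index(drop=True)
```

Calling it on `pd.read_csv("unicef_indicator_1.csv")` would list the low-access countries. Whether the result matches the nineteen countries mentioned above depends on how the filtered spreadsheet used for the chart was prepared.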
Code
import pandas as pd
from plotnine import (
    ggplot, aes, geom_bar, labs, theme_minimal, theme,
    element_text, scale_fill_gradient,
)

# Load the filtered data used for the first chart
df = pd.read_excel("unicef_indicator_1_filtered_firstgraph.xlsx")

bar_chart = (
    ggplot(df, aes(x='country', y='access to care for diarrhoea', fill='access to care for diarrhoea'))
    + geom_bar(stat='identity', show_legend=False, width=0.2)
    + scale_fill_gradient(low='#D5006D', high='#FF80AB')
    + labs(
        title='Healthcare access for children with diarrhoea in lowest-access countries',
        x='Country',
        y='Access to Care for Diarrhoea (%)'
    )
    + theme_minimal()
    + theme(
        axis_text_x=element_text(rotation=45, ha='right'),
        figure_size=(10, 5),
        plot_title=element_text(size=14, weight='bold'),
        axis_title=element_text(size=12)
    )
)

# Show the chart
bar_chart
A serious public health crisis is evident: the limited accessibility and availability of essential healthcare services in specific areas of the world.
Despite being a preventable and treatable condition, diarrhoea remains one of the leading causes of child mortality in countries where healthcare systems are fragile. According to 2021 data, diarrhoea accounted for 9% of all deaths among children under five: over 1,200 a day, or about 440,000 a year.
Mortality is highest in Sub-Saharan Africa.
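The daily and annual figures quoted above can be checked against each other with a few lines of arithmetic. The sketch below uses only the two numbers stated in the text (440,000 deaths a year and a 9% share). The implied total of under-five deaths is derived from those two figures and is not an independently sourced statistic. The function names are illustrative.

```python
ANNUAL_DIARRHOEA_DEATHS = 440_000   # annual figure quoted in the text (2021)
SHARE_OF_UNDER_FIVE_DEATHS = 0.09   # 9% share quoted in the text


def deaths_per_day(annual_deaths, days_in_year=365):
    """Convert an annual death count into an average daily count."""
    return annual_deaths / days_in_year


def implied_total_deaths(cause_deaths, share):
    """Total deaths implied when `cause_deaths` represents `share` of the total."""
    return cause_deaths / share


daily = deaths_per_day(ANNUAL_DIARRHOEA_DEATHS)
total = implied_total_deaths(ANNUAL_DIARRHOEA_DEATHS, SHARE_OF_UNDER_FIVE_DEATHS)

print(f"Average deaths per day: {daily:,.0f}")                     # about 1,205
print(f"Implied under-five deaths from all causes: {total:,.0f}")  # about 4.9 million
```

The daily average of roughly 1,205 is consistent with the "over 1,200 a day" figure.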
Code
import pandas as pd
import plotly.express as px

# Load the dataset
df = pd.read_csv("unicef_indicator_2 (2).csv")

# Keep the 2021 observations measured in %
# (.copy() avoids pandas' SettingWithCopyWarning when a column is added below)
df_2021 = df[
    (df["time_period"] == 2021) & (df["unit_of_measure"] == "%")
].copy()

# Add a formatted column for the hover text
df_2021["% Water Access"] = df_2021["obs_value"].apply(lambda x: f"{x:.1f}%")

# Build the choropleth map
fig = px.choropleth(
    df_2021,
    locations="alpha_3_code",
    color="obs_value",
    hover_name="country",
    hover_data={
        "alpha_3_code": False,
        "obs_value": False,
        "% Water Access": True
    },
    color_continuous_scale="Blues",
    title="<b>Health care facilities & water access: country comparison 2021</b>"
)

fig.update_layout(
    coloraxis_colorbar=dict(
        title="Water Access (%)",
        tickformat=".1f"
    )
)

fig.show()
Fortunately, my mother managed to find a healthcare facility and took me there.
However, the share of healthcare facilities with access to basic water services varies widely across the world.
Without basic water services, healthcare facilities cannot adequately treat children suffering from diarrhoea.
import pandas as pd
from plotnine import (
    ggplot, aes, geom_line, labs, theme_minimal, theme,
    element_text, scale_color_manual,
)

# Load the data
df = pd.read_csv("unicef_metadata_7.csv")

# Define the two country groups
high_birth = ["Niger", "Chad", "Somalia", "Central African Republic", "Mali",
              "Congo", "Nigeria", "Uganda", "Mozambique", "Benin"]
low_birth = ["South Korea", "Hong Kong", "Puerto Rico", "Japan", "Andorra",
             "San Marino", "Italy", "China", "Spain", "Greece"]

# Label each row with its group (None for countries in neither list)
df["group"] = df["country"].apply(
    lambda x: "Top 10 High Birth Rate Countries" if x in high_birth
    else ("Top 10 Low Birth Rate Countries" if x in low_birth else None)
)

# Keep only the rows that belong to one of the two groups
df_grouped = df[df["group"].notna()]

# Drop rows with a missing birth rate
df_grouped = df_grouped.dropna(subset=["Birth rate, crude (per 1,000 people)"])

# Compute the yearly mean for each group
media_anni = (
    df_grouped
    .groupby(["year", "group"])["Birth rate, crude (per 1,000 people)"]
    .mean()
    .reset_index()
)

# Build the line chart
grafico = (
    ggplot(media_anni, aes(x="year", y="Birth rate, crude (per 1,000 people)", color="group"))
    + geom_line(size=1.5)
    + scale_color_manual(values={
        "Top 10 High Birth Rate Countries": "#FF69B4",  # pink
        "Top 10 Low Birth Rate Countries": "#800080"    # purple
    })
    + labs(
        title="Birth Rate Trends: High vs Low Rate Countries",
        x="Year",
        y="Average Birth Rate (per 1,000 people)",
        color="Country Group"
    )
    + theme_minimal()
    + theme(
        plot_title=element_text(size=14, weight='bold', color="#4B0082"),  # darker purple
        axis_title=element_text(size=12),
        legend_title=element_text(size=11),
        legend_text=element_text(size=10),
        figure_size=(8, 6)  # narrower chart
    )
)

# Show the chart
grafico
Code
import pandas as pd
import plotly.express as px

# Load the dataset
df = pd.read_csv("unicef_metadata_7.csv")

# Keep 2022 and drop rows missing either variable
# (.copy() avoids pandas' SettingWithCopyWarning on the assignment below)
df_2022 = df[df["year"] == 2022]
df_cleaned = df_2022.dropna(
    subset=["GDP per capita (constant 2015 US$)", "Birth rate, crude (per 1,000 people)"]
).copy()
df_cleaned["Birth rate, crude (per 1,000 people)"] = df_cleaned["Birth rate, crude (per 1,000 people)"].round(1)

# Build the scatter plot with a LOWESS trendline
fig = px.scatter(
    df_cleaned,
    x="GDP per capita (constant 2015 US$)",
    y="Birth rate, crude (per 1,000 people)",
    title="<b>GDP per capita and its effect on birth rate</b>",  # bold title
    labels={
        "GDP per capita (constant 2015 US$)": "GDP per capita (USD)",
        "Birth rate, crude (per 1,000 people)": "Birth rate (per 1,000 people)"
    },
    color_discrete_sequence=["#FDB79A"],  # peach markers
    trendline="lowess",
    trendline_color_override="#FF69B4"    # pink trendline
)

# Style the markers
fig.update_traces(
    mode='markers',
    marker=dict(size=8, opacity=0.7, line=dict(width=1, color='DarkSlateGrey')),
    selector=dict(mode='markers')
)

# Thicken the trendline
fig.update_traces(
    line=dict(width=4),
    selector=dict(type='scatter', mode='lines')
)

# Update the layout
fig.update_layout(
    title={
        'text': "<b>GDP per capita and its effect on birth rate</b>",
        'y': 0.9,
        'x': 0.5,
        'xanchor': 'center',
        'yanchor': 'top',
        'font': dict(size=18, color='#800080')  # purple title
    },
    xaxis_title="GDP per capita (USD)",
    yaxis_title="Birth rate (per 1,000 people)",
    template="plotly_white",
    font=dict(family="Arial", size=12, color="black"),  # black axis text
    hovermode="closest",
    showlegend=False,
    xaxis=dict(
        range=[0, 100000],
        tickvals=[0, 20000, 40000, 60000, 80000, 100000],
        tickformat=",",
        title_font=dict(color="black"),
        tickfont=dict(color="black")
    ),
    yaxis=dict(
        range=[0, 45],
        tickvals=[0, 5, 10, 15, 20, 25, 30, 35, 40, 45],
        title_font=dict(color="black"),
        tickfont=dict(color="black")
    )
)

# Add a data-source annotation below the plot
fig.add_annotation(
    x=0.5,
    y=-0.15,
    xref="paper",
    yref="paper",
    text="Source: UNICEF data, 2022",
    showarrow=False,
    font=dict(size=10, color="grey")
)

fig
Code
import pandas as pd

# Load the file
df = pd.read_csv('unicef_metadata_7.csv')

# Make sure GDP per capita is numeric
df['GDP per capita (constant 2015 US$)'] = pd.to_numeric(df['GDP per capita (constant 2015 US$)'], errors='coerce')

# Keep only 2022
df_2022 = df[df['year'] == 2022]

# Drop rows with a missing GDP per capita
df_2022 = df_2022.dropna(subset=['GDP per capita (constant 2015 US$)'])

# Find the 10 countries with the highest GDP per capita
top10_highest_countries = df_2022.sort_values(by='GDP per capita (constant 2015 US$)', ascending=False).head(10)['country'].tolist()

# Find the 10 countries with the lowest GDP per capita
top10_lowest_countries = df_2022.sort_values(by='GDP per capita (constant 2015 US$)', ascending=True).head(10)['country'].tolist()

# Print only the two lists of countries
print("Top 10 countries with the HIGHEST GDP per capita in 2022:")
print(top10_highest_countries)
print("\nTop 10 countries with the LOWEST GDP per capita in 2022:")
print(top10_lowest_countries)
Top 10 countries with the HIGHEST GDP per capita in 2022:
['Monaco', 'Bermuda', 'Luxembourg', 'Ireland', 'Switzerland', 'Cayman Islands', 'Norway', 'Singapore', 'United States', 'Qatar']
Top 10 countries with the LOWEST GDP per capita in 2022:
['Burundi', 'Afghanistan', 'Central African Republic', 'Madagascar', 'Somalia', 'Congo, the Democratic Republic of the', 'Malawi', 'Niger', 'Chad', 'Mozambique']
Code
# Libraries
import pandas as pd
import plotly.express as px

# Load the dataset
df = pd.read_csv("unicef_indicator_1 (2).csv")

# The 10 countries with the lowest GDP per capita in 2022
countries_of_interest = [
    'Burundi', 'Afghanistan', 'Central African Republic', 'Madagascar', 'Somalia',
    'Congo, the Democratic Republic of the', 'Malawi', 'Niger', 'Chad', 'Mozambique'
]

# Filter the dataset: female and male rows only, and only the 10 countries
df_filtered = df[
    (df["sex"].isin(["Female", "Male"])) &
    (df["country"].isin(countries_of_interest))
]

# Compute the mean by year and sex
df_grouped = df_filtered.groupby(["time_period", "sex"], as_index=False)["obs_value"].mean()

# Build the line chart
fig = px.line(
    df_grouped,
    x="time_period",
    y="obs_value",
    color="sex",
    title="<b>Healthcare Access for Children with Diarrhoea (Top 10 Poorest Countries)</b>",
    labels={
        "time_period": "Year",
        "obs_value": "Average Value (%)",
        "sex": "Sex"
    },
    color_discrete_map={
        "Female": "#FF69B4",  # pink
        "Male": "#00BFFF"     # light blue
    }
)

# Improve the appearance
fig.update_traces(mode="lines+markers")  # lines with point markers
fig.update_layout(
    template="plotly_white",
    xaxis=dict(tickmode="linear"),  # show every year
    yaxis=dict(range=[0, 100])      # percentage range 0-100
)

fig